Phylogenetics, or the inference of evolutionary trees, is one of the oldest and most intensively studied topics in computational biology. Yet it remains a vibrant area of research, in part because advances in our ability to gather data for phylogenetic inference continue to create novel and more challenging variants of the phylogeny problem. In this talk, I will discuss a particular challenge underlying...
Multiple alignment of the gene orders in sequenced genomes is an important problem in comparative genomics [1]. A key aspect is the construction of disjoint orthology sets of genes, in which each element is orthologous to all other genes (on different genomes) in the same set. Approaches differ as to the nature and timing and relative importance of sequence alignment, synteny block construction, and...
The rapidly increasing number of sequenced genomes offers the chance to resolve longstanding questions about the evolutionary history of certain groups of organisms, to develop a better understanding of evolution, to make substantial advances in functional genomics, and to start bridging genomics and genetics. Comparative genomics is the term used today for much of the work carried out in whole-genome...
For most neuropsychiatric disorders there is a lack of good cellular and animal models. Thus genetic and pharmacogenetic studies of these complex disorders are powerful approaches to identify the underlying genes and pathways. Recent advances in high-throughput sequencing technologies enable such studies at genome scale. Due to the huge amount of data involved, bioinformatic research and applications...
Genome-wide association (GWA) studies have recently become popular as a tool for identifying genetic variables that are responsible for increased disease susceptibility. A modern statistical method for approaching this problem is through model selection (or structure estimation) of Structured Input-Output Regression Models (SIORM) fitted on genetic and phenotypic variation data across a large number...
Identifying essential proteins is important for understanding the minimal requirements for cellular survival and development. Numerous computational methods have been proposed to identify essential proteins from protein-protein interaction (PPI) networks. However, most methods use only the PPI network topology information. Hart GT indicated that essentiality is a product of the protein complex rather than...
Identifying the location of binding sites on proteins is of fundamental importance for a wide range of applications including molecular docking, de novo drug design, structure identification and comparison of functional sites. Structural genomic projects are beginning to produce protein structures with unknown functions. Therefore, efficient methods are required if all these structures are to be properly...
The recent interest in function of various RNA structures, reflected in the growth of solved RNA structures in PDB, calls for methods for effective and efficient similarity search in RNA structural databases. Here, we propose a method called SETTER (RNA SEcondary sTructure-based TERtiary structure similarity) based on partitioning of RNA structures into so-called generalized secondary structure units...
Essential genes are indispensable for an organism’s survival. These genes are widely discussed, and many researchers have proposed prediction methods that not only find essential genes but also assist pathogen discovery and drug development. However, few studies have utilized the relationship between gene functions and essential genes for essential gene prediction. In this paper, we explore the topic of essential...
Three-dimensional (3D) reconstruction of electron tomography (ET) has emerged as a leading technique to elucidate the molecular structures of complex biological specimens. Blob-based iterative methods are advantageous for 3D reconstruction of ET, but incur huge computational costs. Multiple graphics processing units (multi-GPUs) offer an affordable platform to meet these demands,...
RNA interactions are fundamental to a multitude of cellular processes including post-transcriptional gene regulation. Although much progress has been made recently at developing fast algorithms for predicting RNA interactions, much less attention has been devoted to the development of efficient algorithms and data structures for locating RNA interaction patterns. We present two algorithms for...
Identification of essential proteins is key to understanding the minimal requirements for cellular life and important for drug design. The rapid increase in available protein-protein interaction data has made it possible to detect protein essentiality at the network level. A series of centrality measures have been proposed to discover essential proteins based on network topology. However, most of them tended...
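As a rough illustration of what a topology-based centrality measure looks like, the sketch below ranks proteins by degree centrality, which is among the simplest such measures. It is not necessarily one of the measures evaluated in the paper. The function name `degree_centrality` and the protein labels `P1` to `P4` are hypothetical, and the edge list is made-up toy data.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Count distinct interaction partners per protein in an undirected PPI network.

    `edges` is an iterable of (protein_a, protein_b) pairs. Self-interactions
    are ignored and duplicate edges are counted once.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return {protein: len(partners) for protein, partners in neighbors.items()}

# Toy network (hypothetical proteins, not real interaction data).
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
scores = degree_centrality(toy_edges)

# Rank proteins from most to least connected; ties are broken by name.
ranking = sorted(scores, key=lambda p: (-scores[p], p))
```

Under this measure the best-connected protein is ranked first as the most likely essential candidate; measures that also use non-topological evidence would replace the scoring function while keeping the same ranking step.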
Based on the gene order of four core eudicot genomes (cacao, castor bean, papaya and grapevine) that have escaped any recent whole genome duplication (WGD) events, and two others (poplar and cucumber) that descend from independent WGDs, we infer the ancestral gene order of the rosid clade and those of its main subgroups, the fabids and malvids. We use the gene order evidence to evaluate the hypothesis...
Detecting essential multiprotein modules that change infrequently during evolution is a challenging algorithmic task that is important for understanding the structure, function, and evolution of the biological cell. In this paper, we present a linear-time algorithm, Produles, that improves on the running time of previous algorithms. We present a biologically motivated graph theoretic set of algorithm...
A Maximum Agreement SubTree (MAST) is a largest subtree common to a set of trees and serves as a summary of common substructure in the trees. A single MAST can be misleading, however, since there can be an exponential number of MASTs, and two MASTs for the same tree set do not even necessarily share any leaves. In this paper we introduce the notion of the Kernel Agreement SubTree (KAST), which is...
Prediction of protein contact maps is of great importance since it can facilitate and improve the prediction of protein 3D structure. However, the prediction accuracy is notoriously low. In this paper, a consensus contact map prediction method called LRcon is developed, which combines the prediction results from several complementary predictors by using a logistic regression model...
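To make the idea of a logistic regression consensus concrete, here is a minimal sketch of how the scores of several predictors for one residue pair could be combined into a single contact probability. This is not the LRcon implementation: the function name `consensus_probability` is hypothetical, and the weights and bias are placeholders that would in practice be fitted on training data.

```python
import math

def consensus_probability(scores, weights, bias):
    """Combine per-predictor contact scores with a logistic model.

    Returns sigmoid(bias + sum(w_i * s_i)), interpreted as the probability
    that the residue pair is in contact. Weights and bias are assumed to
    come from a previously fitted logistic regression.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))

# Placeholder parameters for three hypothetical predictors.
example_weights = [1.5, 0.8, 2.0]
example_bias = -2.0

low = consensus_probability([0.1, 0.2, 0.1], example_weights, example_bias)
high = consensus_probability([0.9, 0.8, 0.9], example_weights, example_bias)
```

With positive weights, a residue pair that all predictors score highly receives a higher consensus probability than one they all score low.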
Evolutionary methods are increasingly challenged by the fast-growing resources of genomic sequence information. Fundamental evolutionary events, like gene duplication, loss, and deep coalescence, account more than ever for incongruence between gene trees and the actual species tree. Gene tree reconciliation addresses this fundamental problem by invoking the minimum number of gene-duplication and...
We propose a computational method to comprehensively screen for pharmacogenomic pathway simulation models. A systematic model generation strategy is developed; candidate pharmacogenomic models are automatically generated from some prototype models constructed from existing literature. The parameters in the model are automatically estimated based on time-course observed gene expression data by data...
Phylogenetic methods must account for the biological processes that create incongruence between gene trees and the species phylogeny. Deep coalescence, or incomplete lineage sorting, creates discord among gene trees at the early stages of species divergence or in cases when the time between speciation events was short and the ancestral population sizes were large. The deep coalescence problem takes...
A Robinson-Foulds (RF) supertree for a collection of input trees is a comprehensive species phylogeny that is at minimum total RF distance to the input trees. Thus, an RF supertree is consistent with the maximum number of splits in the input trees. Constructing rooted and unrooted RF supertrees is NP-hard. Nevertheless, effective local search heuristics have been developed for the restricted case...
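The RF distance underlying this definition can be sketched for the simplest case: rooted trees over the same leaf set, written as nested tuples of leaf names. The distance is the number of clusters (leaf sets below an internal node) found in exactly one of the two trees. The function names and the toy trees are hypothetical; in a real supertree setting the input trees have different leaf sets, so the candidate supertree would first be restricted to each input tree's leaves, a step this sketch omits.

```python
def clusters(tree):
    """Return the clusters of a rooted tree given as nested tuples of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        leaf_set = frozenset().union(*(leaves(child) for child in node))
        found.add(leaf_set)  # one cluster per internal node
        return leaf_set

    leaves(tree)
    return found

def rf_distance(tree_a, tree_b):
    """Number of clusters present in exactly one of the two rooted trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))

def total_rf_distance(candidate, input_trees):
    """Objective minimized by an RF supertree (same-leaf-set case only)."""
    return sum(rf_distance(candidate, tree) for tree in input_trees)

# Toy rooted trees on the leaves a, b, c, d.
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
```

Here `t1` and `t2` share the clusters {a, b, c} and {a, b, c, d} and differ in {a, b} versus {a, c}. A local search heuristic would repeatedly modify a candidate tree and keep changes that lower `total_rf_distance`.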